Mixing and Merging for Spoken Document Retrieval
Abstract
This paper describes a number of experiments that explored the issues surrounding the retrieval of spoken documents. Two such issues were examined. The first was finding the best use of speech recogniser output to produce the highest retrieval effectiveness. The second was investigating the potential problems of retrieving from a so-called "mixed collection", i.e. one that contains documents from both a speech recognition system (producing many errors) and from hand transcription (producing presumably near-perfect documents). The first part of the work found that merging the transcripts of multiple recognisers showed most promise. The second part showed that the term weighting scheme used in a retrieval system was important in determining whether the system was affected detrimentally when retrieving from a mixed collection.
Similar resources
TREC-7 Experiments at the University
The University of Maryland participated in three TREC-7 tasks: ad hoc retrieval, cross-language retrieval, and spoken document retrieval. The principal focus of the work was evaluation of merging techniques for cross-language text retrieval from mixed language collections. The results show that biasing the merging strategy in favor of documents in the query language can be helpful. Ad hoc and s...
TREC Experiments at the University of Maryland
The University of Maryland participated in three TREC tasks: ad hoc retrieval, cross-language retrieval, and spoken document retrieval. The principal focus of the work was evaluation of merging techniques for cross-language text retrieval from mixed language collections. The results show that biasing the merging strategy in favor of documents in the query language can be helpful. Ad hoc and spoken do...
TREC-7 Experiments at the University of Maryland
The University of Maryland participated in three TREC-7 tasks: ad hoc retrieval, cross-language retrieval, and spoken document retrieval. The principal focus of the work was evaluation of merging techniques for cross-language text retrieval from mixed language collections. The results show that biasing the merging strategy in favor of documents in the query language can be helpful. Ad hoc and s...
Mixing and Merging for Spoken
This paper describes a number of experiments that explored the issues surrounding the retrieval of spoken documents. Two such issues were examined. First, attempting to find the best use of speech recogniser output to produce the highest retrieval effectiveness. Second, investigating the potential problems of retrieving from a so-called "mixed collection", i.e. one that contains documents from bo...
AT&T at TREC-6: SDR Track
In the spoken document retrieval track, we study how higher word-recall (recognizing many of the spoken words) affects the retrieval effectiveness for speech documents, given that high word-recall comes at a cost of low word-precision (recognizing many words that were not actually spoken). We hypothesize that information retrieval algorithms would benefit from a higher word-recall and are robust again...
Publication date: 1998